Simple Test Strategies for Cost-Sensitive Decision Trees
نویسندگان
چکیده
We study cost-sensitive learning of decision trees that incorporate both test costs and misclassification costs. In particular, we first propose a lazy decision tree learning that minimizes the total cost of tests and misclassifications. Then assuming test examples may contain unknown attributes whose values can be obtained at a cost (the test cost), we design several novel test strategies which attempt to minimize the total cost of tests and misclassifications for each test example. We empirically evaluate our treebuilding and various test strategies, and show that they are very effective. Our results can be readily applied to real-world diagnosis tasks, such as medical diagnosis where doctors must try to determine what tests (e.g., blood tests) should be ordered for a patient to minimize the total cost of tests and misclassifications (misdiagnosis). A case study on heart disease is given throughout the paper.
منابع مشابه
Cost-Sensitive Decision Trees with Pre-pruning
This paper explores two simple and efficient pre-pruning strategies for the cost-sensitive decision tree algorithm to avoid overfitting. One is to limit the cost-sensitive decision trees to a depth of two. The other is to prune the trees with a pre-specified threshold. Empirical study shows that, compared to the error-based tree algorithm C4.5 and several other cost-sensitive tree algorithms, t...
متن کاملRules-based Classification with Limited Cost
In test cost-sensitive decision systems, it is difficulty for us to find an optimal attribute set and construct a quality classifier with limited cost. The minimal test cost-sensitive attribute reduction is proposed to address the former problem. However, it is inevitable to remove some good even better attributes in the minimal test cost-sensitive attribute reduction. As a result, the classifi...
متن کاملHybrid Cost-Sensitive Decision Tree
Cost-sensitive decision tree and cost-sensitive naïve Bayes are both new cost-sensitive learning models proposed recently to minimize the total cost of test and misclassifications. Each of them has its advantages and disadvantages. In this paper, we propose a novel cost-sensitive learning model, a hybrid cost-sensitive decision tree, called DTNB, to reduce the minimum total cost, which integrat...
متن کاملA Competition Strategy to Cost-Sensitive Decision Trees
Learning from data with test cost and misclassification cost has been a hot topic in data mining. Many algorithms have been proposed to induce decision trees for this purpose. This paper studies a number of such algorithms and presents a competition strategy to obtain trees with lower cost. First, we generate a population of decision trees using λ-ID3 and EG2 algorithms through considering info...
متن کاملEnsemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005